Intelligent collections models

ABSTRACT

Apparatuses, computer media, and methods for analyzing credit and tax form data and determining a collection treatment type to collect revenue. A collections model is constructed to determine a collections score that is based on raw credit data and tax form data and is indicative of a debtor's propensity to pay an owed amount. The collections model includes score bands, each score band being associated with a range of credit scores. A collections score is determined from a scoring expression that is associated with a score band and that typically includes a subset of available raw credit data and tax form data. A collections treatment type is determined from a collections score. Each treatment type corresponds to a treatment action that is directed to the debtor. A collections model is constructed from historical tax data, in which score bands and scoring expressions are constructed for the collections model.

FIELD OF THE INVENTION

This application is a continuation application of U.S. patent application Ser. No. 13/655,747 filed Oct. 19, 2012, naming Inder Preet Singh as the inventor, which is a continuation application of U.S. patent application Ser. No. 13/419,248 filed Mar. 13, 2012, naming Inder Preet Singh as the inventor, which is a divisional of U.S. patent application Ser. No. 12/900,839, filed Oct. 8, 2010, naming Inder Preet Singh as the inventor, which is itself a divisional of U.S. patent application Ser. No. 11/566,787, filed Dec. 5, 2006, naming Inder Preet Singh as the inventor. These applications are incorporated herein by reference in their entirety and for all purposes.

BACKGROUND OF THE INVENTION

Revenue agencies typically have more accounts to be collected than resources to collect and resolve the accounts. Historically, revenue agencies have worked all accounts through a single, inflexible workflow with little consideration of the debtor's willingness or ability to pay. Decisions to use outside collections services occur at the end of the process, at which time the accounts are stale.

A revenue agency typically utilizes a FICO score, which is a credit score developed by Fair, Isaac & Co. Credit scoring is a method for determining the likelihood that credit users will pay their bills. Fair, Isaac began its pioneering work with credit scoring in the late 1950s and, since then, scoring has become widely accepted by lenders as a reliable means of credit evaluation. A credit score attempts to condense a borrower's credit history into a single number. However, Fair, Isaac & Co. and the credit bureaus do not reveal how the credit scores are computed. The Federal Trade Commission has ruled this approach to be acceptable. Credit scores are calculated by using scoring models and mathematical tables that assign points for different pieces of information which best predict future credit performance. Developing these models involves studying how thousands, even millions, of people have used credit. Score-model developers find predictive factors in the data that have proven to indicate future credit performance. Models can be developed from different sources of data. Credit-bureau models are developed from information in consumer credit-bureau reports.

Credit scores analyze a borrower's credit history considering numerous factors such as:

Late payments

The amount of time credit has been established

The amount of credit used versus the amount of credit available

Length of time at present residence

Employment history

Negative credit information such as bankruptcies, charge-offs, collections, etc.

There are typically three FICO scores, each computed from data provided by one of the three most prevalent credit bureaus: Experian, TransUnion, and Equifax. Some lenders use one of these three scores, while other lenders may use the middle score.

Relying on a credit score alone to determine the propensity to pay leaves little flexibility to alter the collections model. A revenue agency, for example, may wish to tailor its collection model to better fit available data. Moreover, a revenue agency can customize its collection practices to more effectively use collections resources and to identify those accounts that will require private collections services early in the process.

BRIEF SUMMARY OF THE INVENTION

Embodiments of the invention provide apparatuses, computer media, and methods for analyzing raw credit data and tax form data to determine a collections score that is indicative of a debtor's (tax filer's) propensity to pay an owed amount to a revenue agency.

With one aspect of the invention, a collections model is formed from raw credit data, tax form data, and credit scores. The collections model includes a plurality of score bands, in which a score band is associated with a range of credit scores.

With another aspect of the invention, a collections score is determined from a scoring expression that is associated with each score band. The scoring expression typically includes a subset of available raw credit data and tax form data. A scoring expression that is associated with a score band may utilize different variables than another scoring expression that is associated with another score band.

With another aspect of the invention, a collections treatment type for a debtor is determined from a collections score. The collections treatment type may be independent of the score band of the debtor. Each collections treatment type corresponds to a treatment action that is directed to the debtor. Moreover, the collections treatment type for a given collections score range may be modified if the revenue agency wishes to alter the collections model.

With another aspect of the invention, a collections model is constructed from historical tax data. A plurality of score bands is constructed for the collections model, where a different scoring expression is associated with each score band.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limited in the accompanying figures in which like reference numerals indicate similar elements and in which:

FIG. 1 shows an architecture of a computer system in accordance with an embodiment of the invention.

FIG. 2 shows a process for modeling revenue collections in accordance with an embodiment of the invention.

FIG. 3 illustrates a process for assigning a debtor to a score band in accordance with an embodiment of the invention.

FIG. 4 shows variables for scoring in a first score band in accordance with an embodiment of the invention.

FIG. 5 shows variables for scoring in a second score band in accordance with an embodiment of the invention.

FIG. 6 shows variables for scoring in a third score band in accordance with an embodiment of the invention.

FIG. 7 shows variables for scoring in a fourth score band in accordance with an embodiment of the invention.

FIG. 8 shows variables for scoring in a fifth score band in accordance with an embodiment of the invention.

FIG. 9 shows variables for scoring in a sixth score band in accordance with an embodiment of the invention.

FIG. 10 shows a process for determining a collections score for a debtor in accordance with an embodiment of the invention.

FIG. 11 shows a process for determining a collections treatment type from a collections score in accordance with an embodiment of the invention.

FIG. 12 shows an apparatus that analyzes raw credit data and tax form data to initiate a collections treatment action in accordance with an embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

Elements of the present invention may be implemented with computer systems, such as computer 100 shown in FIG. 1. Computer 100 may be incorporated in an apparatus (as shown in FIG. 12) that analyzes input data and consequently initiates a collections treatment action for collecting revenues. Computer 100 includes a central processor 110, a system memory 112 and a system bus 114 that couples various system components including the system memory 112 to the central processor 110. System bus 114 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The structure of system memory 112 is well known to those skilled in the art and may include a basic input/output system (BIOS) stored in a read only memory (ROM) and one or more program modules such as operating systems, application programs and program data stored in random access memory (RAM).

Computer 100 may also include a variety of interface units and drives for reading and writing data. In particular, computer 100 includes a hard disk interface 116 and a removable memory interface 120 respectively coupling a hard disk drive 118 and a removable memory drive 122 to system bus 114. Examples of removable memory drives include magnetic disk drives and optical disk drives. The drives and their associated computer-readable media, such as a floppy disk 124, provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for computer 100. A single hard disk drive 118 and a single removable memory drive 122 are shown for illustration purposes only and with the understanding that computer 100 may include several of such drives. Furthermore, computer 100 may include drives for interfacing with other types of computer readable media.

A user can interact with computer 100 with a variety of input devices. FIG. 1 shows a serial port interface 126 coupling a keyboard 128 and a pointing device 130 to system bus 114. Pointing device 130 may be implemented with a mouse, track ball, pen device, or similar device. Of course, one or more other input devices (not shown) such as a joystick, game pad, satellite dish, scanner, touch sensitive screen or the like may be connected to computer 100.

Computer 100 may include additional interfaces for connecting devices to system bus 114. FIG. 1 shows a universal serial bus (USB) interface 132 coupling a video or digital camera 134 to system bus 114. An IEEE 1394 interface 136 may be used to couple additional devices to computer 100. Furthermore, interface 136 may be configured to operate with particular manufacturer interfaces such as FireWire developed by Apple Computer and i.Link developed by Sony. Input devices may also be coupled to system bus 114 through a parallel port, a game port, a PCI board or any other interface used to couple an input device to a computer.

Computer 100 also includes a video adapter 140 coupling a display device 142 to system bus 114. Display device 142 may include a cathode ray tube (CRT), liquid crystal display (LCD), field emission display (FED), plasma display or any other device that produces an image that is viewable by the user. Additional output devices, such as a printing device (not shown), may be connected to computer 100.

Sound can be recorded and reproduced with a microphone 144 and a speaker 146. A sound card 148 may be used to couple microphone 144 and speaker 146 to system bus 114. One skilled in the art will appreciate that the device connections shown in FIG. 1 are for illustration purposes only and that several of the peripheral devices could be coupled to system bus 114 via alternative interfaces. For example, video camera 134 could be connected to IEEE 1394 interface 136 and pointing device 130 could be connected to USB interface 132.

Computer 100 can operate in a networked environment using logical connections to one or more remote computers or other devices, such as a server, a router, a network personal computer, a peer device or other common network node, a wireless telephone or wireless personal digital assistant. Computer 100 includes a network interface 150 that couples system bus 114 to a local area network (LAN) 152. Networking environments are commonplace in offices, enterprise-wide computer networks and home computer systems.

A wide area network (WAN) 154, such as the Internet, can also be accessed by computer 100. FIG. 1 shows a modem unit 156 connected to serial port interface 126 and to WAN 154. Modem unit 156 may be located within or external to computer 100 and may be any type of conventional modem such as a cable modem or a satellite modem. LAN 152 may also be used to connect to WAN 154. FIG. 1 shows a router 158 that may connect LAN 152 to WAN 154 in a conventional manner.

It will be appreciated that the network connections shown are exemplary and other ways of establishing a communications link between the computers can be used. The existence of any of various well-known protocols, such as TCP/IP, Frame Relay, Ethernet, FTP, HTTP and the like, is presumed, and computer 100 can be operated in a client-server configuration to permit a user to retrieve web pages from a web-based server. Furthermore, any of various conventional web browsers can be used to display and manipulate data on web pages.

The operation of computer 100 can be controlled by a variety of different program modules. Examples of program modules are routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The present invention may also be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, personal digital assistants and the like. Furthermore, the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

An embodiment of the invention supports the development of unique analytic models to score debtors (i.e., tax filers who owe money) with outstanding accounts receivable that are owed to government revenue agencies. The scores generated by the collections model represent the propensity of a debtor to pay and also provide insight into the level of effort that will be required to collect the debt by the revenue agency. Collection models may blend demographic and financial information maintained by the revenue agency with commercial data that is reflective of a debtor's ability to pay and credit history. While a revenue agency is typically a governmental organization, revenue collections can be performed by a private organization that has been contracted by a government (Federal, state, or local). In such a case, required tax and credit information is made available to the private agency with proper security measures.

With the prior art, collections models for revenue agencies typically use only internal revenue agency data. With an embodiment of the invention, collections models involve the blending of the internal revenue agency data and the use of commercial financial and credit data. The final collections model may provide a significant improvement in identifying receivables that debtors are more likely to pay during the collections process. The final collections model is typically more predictive than a FICO-only model or a tax data-only model. Both tax form data and credit data are often very predictive in explaining payment behavior. Those who have a good credit history also tend to be good tax payers. For example, the ratio of tax still owed to income (corresponding to ratio_taxowed_ctincome as will be discussed) is a predictive tax variable: those with a higher ratio are less likely to pay.

FIG. 2 shows process 200 for modeling revenue collections in accordance with an embodiment of the invention. Process 200 demonstrates quantitative benefits of using a collections model for prioritizing receivable cases. A collections model is built from developed datasets. With an embodiment of the invention, process 200 provides a test-deploy collections model as a proof-of-concept for developing a business case for a state government.

With module 201, client customer data is blended with credit history data and other data as required to fulfill the specific requirements of a collections model. In an embodiment of the invention, module 201 extracts historical individual tax data for the State of Connecticut (CT) in the 2002 and 2003 tax years. Payment behavior is primarily modeled on 2003 tax data to predict payment in the 2003 year. The prior tax year's (2002) Paid/Not-Paid flag is also used for additional predictive power.

These data are combined in a database record called the Customer Analytic Record (CAR) by module 203. U.S. Pat. No. 7,047,251 and U.S. application Ser. No. 11/147,034, to Kenneth L. Reed, et al., ('251 and '034, respectively) are incorporated herein by reference. The '251 and '034 references disclose a system and method for creating virtual flat customer records derived from database customer data that may be used as standardized input for analytical models. A Customer Analytic Record (“CAR”) application may be created as a database object to extract, transform, and format the customer data needed for customer segmentation and predictive modeling. The CAR may be a set of database “views” that are defined using virtual stored queries and enabled using capabilities of a database management system and a structured query language. The CAR is typically a virtual “flat” record of the customer data needed for customer analytics. The customer data may be extracted by running one or more SQL queries against the database view(s). The CAR application may dynamically calculate additional variables using predetermined transformations, including custom transformations of an underlying behavior. If additional variables are created, the CAR may be modified to include the additional variables. The CAR is often a dynamic view of the customer record that changes whenever any update is made to the database. The definition of the CAR provides documentation of each data element available for use in models and analytics.
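The idea of a CAR defined as a database view can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows below are hypothetical and are not taken from the '251 or '034 references; the sketch only shows how a stored query can present one flat record per tax filer and reflect later updates to the underlying data.

```python
import sqlite3

# Hypothetical schema: one row per tax transaction (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tax_transactions (filer_id TEXT, amount_owed REAL, amount_paid REAL)"
)
conn.executemany(
    "INSERT INTO tax_transactions VALUES (?, ?, ?)",
    [("F1", 400.0, 100.0), ("F1", 0.0, 50.0), ("F2", 900.0, 0.0)],
)

# The CAR as a view: a stored query that yields one flat record per filer,
# including a derived variable (balance).
conn.execute(
    """
    CREATE VIEW car AS
    SELECT filer_id,
           SUM(amount_owed) AS total_owed,
           SUM(amount_paid) AS total_paid,
           SUM(amount_owed) - SUM(amount_paid) AS balance
    FROM tax_transactions
    GROUP BY filer_id
    """
)


def read_car(connection):
    """Return the current CAR rows, ordered by filer_id."""
    return connection.execute(
        "SELECT filer_id, total_owed, total_paid, balance FROM car ORDER BY filer_id"
    ).fetchall()


before = read_car(conn)

# A new payment is posted; the view reflects it without being rebuilt.
conn.execute("INSERT INTO tax_transactions VALUES ('F2', 0.0, 300.0)")
after = read_car(conn)
```

Because the view is a stored query rather than a copied table, `after` differs from `before` only in the row for the filer whose payment was posted.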

Module 203 creates a CAR table that is used as the model input data set to drive the modeling effort. (With an embodiment, module 203 determined tax filers who owed $50 or more on the cutoff date. The tax filers who owed less than $50 were dropped to provide sharper contrast.) Module 203 rolls up (accumulates) transactional tax data for the identified tax filers (e.g., until the cutoff date of Jul. 15, 2004) to one record per tax filer and creates derived variables such as ratios. Inferred “Goods” (Payers) correspond to tax filers who paid in a performance window of 9 months and “Bads” (Non-Payers) correspond to tax filers who did not pay in the performance window. Module 203 appends credit attributes to each record. (In an embodiment, more than 850 credit attributes provided by TransUnion were appended; TransUnion was able to match 98% of names for credit data.)
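The filtering and labeling described above can be sketched as follows. The record fields, the exact end date of the 9-month window, and the formula used for the ratio variable are assumptions made for illustration; the text states only the $50 threshold, the Jul. 15, 2004 cutoff, and the 9-month window length.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

CUTOFF = date(2004, 7, 15)      # cutoff date given in the text
MIN_OWED = 50.0                 # filers owing less than this are dropped
WINDOW_END = date(2005, 4, 15)  # assumption: 9 calendar months after the cutoff


@dataclass
class FilerRecord:
    """Hypothetical one-row-per-filer record (field names are illustrative)."""
    filer_id: str
    owed_at_cutoff: float
    ct_income: float
    paid_on: Optional[date]  # date the owed amount was paid, or None if unpaid


def label(record: FilerRecord) -> Optional[str]:
    """Return 'Good', 'Bad', or None when the filer is excluded from the sample."""
    if record.owed_at_cutoff < MIN_OWED:
        return None
    paid_in_window = (
        record.paid_on is not None and CUTOFF < record.paid_on <= WINDOW_END
    )
    return "Good" if paid_in_window else "Bad"


def ratio_taxowed_ctincome(record: FilerRecord) -> Optional[float]:
    """Assumed definition: tax still owed divided by income; None if income is zero."""
    if record.ct_income == 0:
        return None
    return record.owed_at_cutoff / record.ct_income
```

A filer who pays after the window closes is labeled "Bad" in this sketch, matching the statement that "Bads" are those who did not pay within the performance window.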

Module 205 provides address hygiene on the historical tax data (e.g., for the years 2002 and 2003) so that the latest and correct address information is associated with the names of tax payers. In an embodiment of the invention, a data provider, e.g., Acxiom Corporation, verifies address information with the names of the identified tax payers. Enhanced address accuracy and completeness via Acxiom's address hygiene process typically results in improved targetability. Name and address information is then sent to a credit bureau, e.g., TransUnion, for credit information. Credit information may include credit scores and raw credit information. Because historical tax information is being analyzed, the credit information typically corresponds to the same timeframe (e.g., for the years 2002 and 2003 in this example).

Module 207 obtains the raw credit data, historical tax data, and credit scores from module 205 to form a collections model using an application developed on the CAR. (Raw data, sometimes called source data or atomic data, is data that has been collected but not processed, formatted, or analyzed for meaningful use. Raw data often is collected in a database, where the raw data can be analyzed and made useful for an application.) Modeling activities begin after the CAR is available. Preliminary data analysis for basic checks and data validity may be performed. With an embodiment, module 207 performs decision tree segmentation on credit scores using a statistical analysis package (e.g., SAS/STAT software) to find sufficiently differentiated segments (score bands) and creates a separate segment model for each score band (segment), thus increasing the overall predictive power.

The collections model may be dynamically retrained prior to use in order to capture the latest information available. This approach is different from the typical static credit model approach where the models and the data variables are held constant. In this case, the collections model and the data are allowed to change.

Module 207 creates a collections model using tax-return and credit data that will identify and rank all future receivables on a likelihood of payment during the collections process. Collections scores generated by the collections model will be used to rank receivables: a higher score implies that a debtor is more likely to pay compared to a debtor with a lower score. On the basis of collections scores, differentiated collections treatments can be designed and optimized over time for each risk score band of the collections model.

With an embodiment of the invention, segment modeling is performed using a KXEN data mining tool. The KXEN tool divides data into estimation (75%) and validation (25%) sub-samples, where validation results verify robustness/stability of the collections model. The KXEN tool differentiates between behavior of “good” and “bad” tax filers. The KXEN tool mines more than 1,000 tax and credit variables and identifies attributes that are predictive in explaining payment behavior. The KXEN tool automatically generates final model equations (scoring expressions) that are used to score tax filers who still owe taxes to find individuals who are most likely to pay owed amounts. With an embodiment of the invention, a scoring expression is a statistical regression equation determined by the statistical tool. The regression equation typically includes only the relevant variables from the more than 1,000 mined variables.
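The 75%/25% division into estimation and validation sub-samples can be illustrated with a generic random holdout split. How the KXEN tool actually partitions the data is not described in the text, so the shuffling approach, function name, and seed parameter below are assumptions.

```python
import random


def split_estimation_validation(records, estimation_fraction=0.75, seed=0):
    """Shuffle a copy of the records and split it into (estimation, validation).

    A plain random holdout; this is an illustrative stand-in, not the
    partitioning method of any particular data mining tool.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * estimation_fraction)
    return shuffled[:cut], shuffled[cut:]


estimation, validation = split_estimation_validation(range(100))
```

A model fitted on `estimation` would then be scored on `validation`; similar performance on both sub-samples is what the text refers to as robustness/stability.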

Module 209 tests and verifies the collections model developed by module 207. In an embodiment, module 209 extracts receivables for the 2004 tax year and determines the collections scores using the collections model. Treatment actions based on the determined treatment type are directed to test groups. The “Goods” (those who pay) and the “Bads” (those who do not pay within a predetermined time duration (performance window)) are measured.

Once the collections model has been developed by module 207 and verified by module 209, module 211 prepares the collections model for the targeted revenue agency. For example, the collections model may be implemented as a computer-readable medium having computer executable instructions and distributed to a revenue agency over a secure communications channel (e.g., LAN 152 as shown in FIG. 1) or as an apparatus that utilizes a computer platform, e.g., computer 100.

FIG. 3 illustrates process 300 for configuring a plurality of score bands in a collections model in accordance with an embodiment of the invention. In an embodiment, process 300 is performed by module 207 as shown in FIG. 2. A sampled population 350 of debtors (using historical tax data as previously discussed) is analyzed to configure a plurality of score bands (segments) in accordance with desired statistical characteristics. The tree-based algorithm finds the top variable which divides the debtors into segments with similar percentage of “goods” and “bads.” Sampled population 350 includes a combination of “Goods” (21966 debtors or 74%) and “Bads” (7727 debtors or 26%). As will be further discussed, the debtors are assigned to one of the score bands based on credit score 351 (NA201TOT) that is built and produced by TransUnion (TU). However, other embodiments may use other scores, e.g., another credit score or a customized score that is determined from a combination of tax form data and raw credit data.

Each debtor of the sampled population of debtors is assigned to one of six score bands (segments) based on the associated credit score 351. Debtors that satisfy criterion 301 (NA201TOT<491.5) are assigned to score band 1, debtors that satisfy criteria 303 and 305 (491.5<=NA201TOT<525.5) are assigned to score band 2, and debtors that satisfy criteria 303 and 307 (525.5<=NA201TOT<581.5) are assigned to score band 3. Similarly, debtors that satisfy criteria 309, 311, and 313 are assigned to score bands 4, 5, and 6, respectively.
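The band assignment can be sketched as a lookup against an ordered list of cut points. Only the three cut points stated above (491.5, 525.5, and 581.5) are used; the cut points separating score bands 4, 5, and 6 are not given in the text, so with this list a result of 4 means "score band 4 or higher." The function and constant names are illustrative.

```python
from bisect import bisect_right

# Cut points stated in the text for criteria 301-307 (NA201TOT credit score).
# The cut points between bands 4, 5, and 6 are not stated and are omitted.
KNOWN_CUTOFFS = (491.5, 525.5, 581.5)


def assign_score_band(credit_score, cutoffs=KNOWN_CUTOFFS):
    """Return the 1-based band index: the count of cutoffs <= credit_score, plus 1.

    Each band covers lower <= score < upper, matching the criteria in the text.
    """
    return bisect_right(cutoffs, credit_score) + 1
```

Supplying a full six-band list of cut points to `cutoffs` would yield band indices 1 through 6 with no change to the function.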

FIGS. 4-9 show configurations for segment models for each of the score bands that are determined by process 300 as performed by module 207 when constructing a collections model. As previously discussed, a scoring expression is determined for each score band (segment). Even though over a thousand credit and tax variables are available, the scoring expressions shown in FIGS. 4-9 are limited to twenty variables in order to reduce calculations for determining a desired collections objective. In general, a scoring expression (given that the $j^{\text{th}}$ score band is selected) may be expressed as:

$$\text{collections\_score} = \sum_{i=1}^{N} W_{i,j} \times V_{i,j} \qquad (\text{EQ. 1})$$

where $N$ is the number of variables used in a scoring expression, $W_{i,j}$ is the weight for the $i^{\text{th}}$ variable of the $j^{\text{th}}$ score band, and $V_{i,j}$ is the value of the $i^{\text{th}}$ variable of the $j^{\text{th}}$ score band.
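EQ. 1 is a weighted sum over the variables chosen for the selected score band, which can be sketched as below. The variable names ratio_taxowed_ctincome and RE36 appear in the text, but the weights and values in the example are made up for illustration; the actual regression coefficients are not given.

```python
def collections_score(weights, values):
    """Evaluate EQ. 1 for one score band: sum of W_i * V_i over the band's variables.

    `weights` maps variable name -> weight for the selected score band.
    `values` maps variable name -> the debtor's value for that variable;
    variables that the band's expression does not use are ignored.
    """
    missing = set(weights) - set(values)
    if missing:
        raise KeyError(f"missing values for variables: {sorted(missing)}")
    return sum(weight * values[name] for name, weight in weights.items())


# Hypothetical weights and values (not taken from FIGS. 4-9).
band_weights = {"ratio_taxowed_ctincome": 0.5, "RE36": 0.25}
debtor_values = {"ratio_taxowed_ctincome": 2.0, "RE36": 4.0, "unused_variable": 9.0}
example_score = collections_score(band_weights, debtor_values)
```

Keeping one weight mapping per score band allows each band to use a different subset of the available variables, as the text describes.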

With an exemplary embodiment of the invention, module 207 selects 20 variables for each scoring expression. However, with other embodiments module 207 may select a different number of variables, where the variables vary with different scoring expressions.

FIG. 4 shows scoring expression 400 for the first score band as shown in FIG. 3 in accordance with an embodiment of the invention. Scoring expression 400 utilizes twenty variables selected from over one thousand raw credit data and tax form data. For example, variable 401 (ratio_taxowed_ctincome) is considered as having the greatest importance and is accordingly given the greatest weight 405 (17.9%). Variable 403 (RE36) has the next greatest importance and is given weight 407 (7.7%).

FIG. 5 shows scoring expression 500 for the second score band as shown in FIG. 3 in accordance with an embodiment of the invention. Scoring expression 500 utilizes twenty variables selected from over one thousand raw credit data and tax form data. For example, variable 401 (ratio_taxowed_ctincome) is considered as having the greatest importance and is accordingly given the greatest weight 503 (13.1%). Variable 501 (PS230) has the next greatest importance and is given weight 505 (7.5%). In the exemplary embodiment, scoring expressions 400 and 500 have one common variable (variable 401) with the remaining variables being different (e.g. variables 403 and 501).

FIG. 6 shows scoring expression 600 for the third score band as shown in FIG. 3 in accordance with an embodiment of the invention. Scoring expression 600 utilizes twenty variables selected from over one thousand raw credit data and tax form data. With an embodiment of the invention, the majority of the variables of scoring expression 600 are different from the variables of the other scoring expressions 400, 500, 700, 800, and 900.

FIG. 7 shows scoring expression 700 for the fourth score band as shown in FIG. 3 in accordance with an embodiment of the invention. Scoring expression 700 utilizes twenty variables selected from over one thousand raw credit data and tax form data. As shown in FIGS. 4-9, variable 401 (ratio_taxowed_ctincome) is commonly used by scoring expressions 400-900. Moreover, some of the variables of scoring expression 700 may be used by some of the other scoring expressions. For example, variable 701 (home_ownership) is used by scoring expression 400 but not by the other scoring expressions.

FIG. 8 shows scoring expression 800 for the fifth score band as shown in FIG. 3 in accordance with an embodiment of the invention. Scoring expression 800 utilizes twenty variables selected from over one thousand raw credit data and tax form data. The fifth score band contains debtors having a very low credit risk with a small proportion of “Bads.”

FIG. 9 shows scoring expression 900 for the sixth score band as shown in FIG. 3 in accordance with an embodiment of the invention. Scoring expression 900 utilizes twenty variables selected from over one thousand raw credit data and tax form data. The sixth score band contains debtors having the lowest credit risk with almost no “Bads.”

As previously discussed, a collections model is constructed as shown in FIGS. 2-9. The collections model can then be used by a revenue agency to determine and initiate collections treatment for debtors.

FIG. 10 shows a process 1000 for determining a collections score for a debtor in accordance with an embodiment of the invention. The collections scores, as generated by collections models, enable revenue agencies to better align workload with workforce and other available resources. Enhanced efficiency is accomplished by prioritizing accounts based upon the collections score. Accordingly, the debtors most likely to pay receive “softer” collection approaches and the debtors least likely to pay receive more assertive treatments earlier in the collections process. The prioritization of accounts identifies the accounts that are most difficult to collect. These accounts can be forwarded to private collections services at the onset when these accounts are still fresh. It is expected that using the collections score to prioritize and assign accounts may increase revenue derived from accounts receivable collections by 3% to 7%.

Procedure 1001 obtains a credit score for a debtor after the collections model has been constructed by process 200 (as shown in FIGS. 2 and 3). In an embodiment of the invention, the credit score is NA201TOT, which is built and produced by TransUnion (TU). (TransUnion is a credit bureau as previously discussed.) NA201TOT is also called the TU New Account Score. As performed by procedure 1003, a tax filer is classified into one of six segments on the basis of the tax filer's NA201TOT score. Each of the six segments (score bands) has a separate model equation (scoring expression). Procedure 1005 uses the associated scoring expression to determine the collections score. If a debtor is assigned to segment ‘2’ on the basis of the debtor's NA201TOT score, then the collections model ‘2’ equation is used to determine the collections score for the debtor. With an embodiment of the invention, procedure 1007 determines the collections treatment type that is based on a debtor's collections score (also called ATCS score), irrespective of the debtor's segment (score band) assignment. In an embodiment, if two debtors have the same collections score but are assigned to different segments, the collections treatment type is the same. (However, embodiments of the invention may associate different collections treatment types for the same collections score for different score bands, i.e., the collections treatment type may be dependent on the score band.) As an example, debtor_1 has an ATCS score of 0.88 and debtor_2 has an ATCS score of 0.14. Debtor_1 has a high score, i.e., is very likely to pay any owed amount, so the revenue agency just sends a notice letter (Treatment Type A). Corresponding actions are initiated from the determined treatment type. Debtor_2 has a low score, i.e., is not likely to pay, so the revenue agency sends the debtor a strongly worded letter. If no payment is received within 21 days, for example, the revenue agency sends another strong letter. If payment is still not received after the second reminder, the revenue agency refers debtor_2 to a debt collector (Treatment Type C). An exemplary collection rule set is:

-   If ATCS>=0.75 then initiate treatment A
-   If 0.4<=ATCS<0.75 then initiate treatment B
-   If ATCS<0.4 then initiate treatment C

Collections score bands and treatments may continuously change and improve over time. For example, one may “tweak” treatment type A. As another example, one may change the cutoff from 0.75 to 0.7. With the above embodiment, NA201TOT is used for scoring any debtor. Using NA201TOT provides additional power to collections models. However, embodiments of the invention may build models without NA201TOT. For example, a collections score may be determined from a combination of tax form data and raw credit data. Procedures 1001-1007 are repeated if additional debtors are to be processed as determined by procedure 1009.
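The banding and scoring steps of process 1000 (procedures 1001 through 1005, repeated per procedure 1009) may be illustrated by the following minimal Python sketch. The band boundaries, the logistic form of the scoring expressions, the coefficients, and the variable names `tax_due_to_income_ratio` and `months_since_last_delinquency` are hypothetical placeholders chosen for illustration only; the description above does not specify them. Procedure 1007 (treatment selection) is not covered by this sketch.

```python
import math
from bisect import bisect_right

# Hypothetical cut points that split the credit score range into six
# score bands numbered 1..6. The actual boundaries are not given here.
BAND_EDGES = [450, 550, 650, 750, 850]

# Hypothetical scoring expressions, one per score band, written as
# (intercept, {variable name: coefficient}). All values are made up.
SCORING_EXPRESSIONS = {
    band: (
        -1.0 + 0.5 * band,
        {"tax_due_to_income_ratio": -2.0, "months_since_last_delinquency": 0.02},
    )
    for band in range(1, 7)
}


def classify_band(credit_score):
    """Procedure 1003: map a credit score to a score band (1..6)."""
    return bisect_right(BAND_EDGES, credit_score) + 1


def collections_score(band, features):
    """Procedure 1005: evaluate the scoring expression of the given band.

    A logistic form is assumed so that the score falls between 0 and 1.
    """
    intercept, coefficients = SCORING_EXPRESSIONS[band]
    z = intercept + sum(c * features[name] for name, c in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


def process_debtors(debtors):
    """Repeat procedures 1001-1005 for each debtor (procedure 1009)."""
    results = {}
    for debtor in debtors:
        band = classify_band(debtor["credit_score"])  # procedures 1001, 1003
        score = collections_score(band, debtor["features"])  # procedure 1005
        results[debtor["id"]] = (band, score)
    return results


debtors = [
    {"id": "debtor_1", "credit_score": 900,
     "features": {"tax_due_to_income_ratio": 0.1, "months_since_last_delinquency": 12}},
    {"id": "debtor_2", "credit_score": 400,
     "features": {"tax_due_to_income_ratio": 0.9, "months_since_last_delinquency": 0}},
]
for debtor_id, (band, score) in process_debtors(debtors).items():
    print(debtor_id, "band", band, "score", round(score, 2))
```

Keeping the band lookup separate from the per-band expressions mirrors the description: each score band carries its own scoring expression, so a band's expression can be replaced without touching the others.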

FIG. 11 shows process 1007 (as shown in FIG. 10) for determining a collections treatment type from a collections score in accordance with an embodiment of the invention. In step 1101, if the collections score (as determined by procedure 1005) is greater than or equal to 0.75, collection treatment type_A 1103 is selected. In step 1105, if the collections score is greater than or equal to 0.4 and less than 0.75, collection treatment type_B 1107 is selected. Otherwise, collection treatment type_C 1109 is selected.
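The decision flow of FIG. 11 may be sketched as a small Python function. The function name `select_treatment` and the parameter names `cutoff_a` and `cutoff_b` are hypothetical; the default cutoffs of 0.75 and 0.4 are the exemplary values given in the description, and they are exposed as parameters because the cutoffs may change over time.

```python
def select_treatment(score, cutoff_a=0.75, cutoff_b=0.4):
    """Return the collection treatment type for a collections score."""
    if cutoff_b >= cutoff_a:
        raise ValueError("cutoff_b must be lower than cutoff_a")
    if score >= cutoff_a:  # step 1101
        return "A"         # treatment type_A 1103
    if score >= cutoff_b:  # step 1105
        return "B"         # treatment type_B 1107
    return "C"             # treatment type_C 1109


print(select_treatment(0.88))                 # A
print(select_treatment(0.14))                 # C
print(select_treatment(0.72))                 # B
print(select_treatment(0.72, cutoff_a=0.7))   # A
```

The last two calls show how lowering the upper cutoff from 0.75 to 0.7 moves a debtor with a score of 0.72 from treatment type B to treatment type A.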

FIG. 12 shows apparatus 1200 that analyzes raw credit data and tax form data to initiate a collections treatment action in accordance with an embodiment of the invention. Model analyzer 1201 constructs a collections model from historical tax data by performing process 200 as previously discussed. Model analyzer 1201 provides the configuration for a plurality of score bands (segments) and associated scoring expressions to scoring analyzer 1203. Scoring analyzer 1203 then determines the collections score for the debtor being processed. Treatment analyzer 1205 determines the collection treatment type from the collections score. Treatment generator 1207 then initiates a treatment action (e.g., letters to debtors and instructions to a debt collector) directed to the debtor.
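The data flow among the components of apparatus 1200 may be sketched in Python as follows. The class name `Apparatus`, the `toy_*` stand-in components, the two-band classifier, the `on_time_ratio` variable, and the action text for treatment type B are all hypothetical and serve only to show how the components hand results to one another; they are not the components of any actual embodiment.

```python
class Apparatus:
    """Wires together stand-ins for components 1201, 1203, 1205, and 1207."""

    def __init__(self, model_analyzer, treatment_analyzer, treatment_generator):
        # Model analyzer 1201 supplies the score band classifier and the
        # per-band scoring expressions once, before any debtor is scored.
        self.band_of, self.expressions = model_analyzer()
        self.treatment_analyzer = treatment_analyzer
        self.treatment_generator = treatment_generator

    def process(self, debtor_id, credit_score, data):
        band = self.band_of(credit_score)
        score = self.expressions[band](data)                   # scoring analyzer 1203
        treatment = self.treatment_analyzer(score)             # treatment analyzer 1205
        return self.treatment_generator(debtor_id, treatment)  # treatment generator 1207


def toy_model_analyzer():
    """Hypothetical stand-in for model analyzer 1201 (two bands only)."""
    band_of = lambda credit_score: 1 if credit_score < 600 else 2
    expressions = {
        1: lambda d: 0.2 * d["on_time_ratio"],
        2: lambda d: 0.9 * d["on_time_ratio"],
    }
    return band_of, expressions


def toy_treatment_analyzer(score):
    """Exemplary cutoffs from the description: 0.75 and 0.4."""
    return "A" if score >= 0.75 else "B" if score >= 0.4 else "C"


ACTIONS = {
    "A": ["send notice letter"],
    "B": ["send reminder letter"],  # placeholder; type B actions are not described
    "C": [
        "send strongly worded letter",
        "send second strong letter if unpaid after 21 days",
        "refer to debt collector if still unpaid",
    ],
}


def toy_treatment_generator(debtor_id, treatment):
    return [f"{debtor_id}: {action}" for action in ACTIONS[treatment]]


apparatus = Apparatus(toy_model_analyzer, toy_treatment_analyzer, toy_treatment_generator)
print(apparatus.process("debtor_1", 700, {"on_time_ratio": 1.0}))
print(apparatus.process("debtor_2", 500, {"on_time_ratio": 1.0}))
```

Passing each component in as a callable keeps the model construction, scoring, treatment selection, and treatment action steps independently replaceable, which matches the separation of components shown in FIG. 12.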

As can be appreciated by one skilled in the art, a computer system (e.g., computer 100 as shown in FIG. 1) with an associated computer-readable medium containing instructions for controlling the computer system may be utilized to implement the exemplary embodiments that are disclosed herein. The computer system may include at least one computer such as a microprocessor, a cluster of microprocessors, a mainframe, and networked workstations.

While the invention has been described with respect to specific examples including presently preferred modes of carrying out the invention, those skilled in the art will appreciate that there are numerous variations and permutations of the above described systems and techniques that fall within the spirit and scope of the invention as set forth in the appended claims. 

1. (canceled)
 2. A computer-implemented method comprising: generating training data for training a predictive model to estimate a likelihood of a debtor to pay a debt, the training data including, for each of a plurality of debtors, (i) historic tax return data for the debtor, (ii) commercial financial or credit data that is usable to calculate a credit score for the debtor, and (iii) a label indicating a likelihood of the debtor to pay a debt; training the predictive model using the historic tax return data, the commercial financial or credit data, and the labels included in the training data; after training the predictive model, identifying a particular debtor; obtaining (i) historic tax return data for the particular debtor, and (ii) commercial financial or credit data that is usable to calculate a credit score for the particular debtor; providing, to the predictive model, (i) the historic tax return data for the particular debtor, and (ii) the commercial financial or credit data that is usable to calculate a credit score for the particular debtor; and obtaining, from the predictive model, an indication of a likelihood of the particular debtor to pay a debt.
 3. The method of claim 2, wherein the historic tax return data comprises one or more values that were entered on a personal income tax form by an associated debtor.
 4. The method of claim 2, wherein the debt comprises a debt owed to a government revenue agency.
 5. The method of claim 2, wherein the historic tax return data comprises data that is available only to a government revenue agency or to designated agents of the government revenue agency.
 6. The method of claim 2, wherein the historic tax return data comprises data that is not reflected on a credit report of an associated debtor.
 7. The method of claim 2, wherein the historic tax return data comprises one or more values that reflect an amount of tax listed as due on a tax return in relation to an amount of income listed on the tax return.
 8. The method of claim 2, wherein the historic tax return data comprises one or more values that characterize a status of a previous year's tax return.
 9. The method of claim 2, wherein the historic tax return data comprises one or more values that reflect an amount of tax listed as due on a tax return in relation to an amount of tax listed as owed on the tax return.
 10. The method of claim 2, wherein the historic tax return data comprises one or more values that reflect a tax penalty amount listed on a tax return.
 11. The method of claim 2, wherein the historic tax return data comprises one or more values that reflect an amount of time after a tax deadline in which a tax return was filed.
 12. The method of claim 2, wherein the historic tax return data comprises one or more values that reflect an amount of tax that a tax return lists as owed.
 13. A computer-readable storage device encoded with a computer program, the program comprising instructions that, if executed by one or more computers, cause the one or more computers to perform operations comprising: generating training data for training a predictive model to estimate a likelihood of a debtor to pay a debt, the training data including, for each of a plurality of debtors, (i) historic tax return data for the debtor, (ii) commercial financial or credit data that is usable to calculate a credit score for the debtor, and (iii) a label indicating a likelihood of the debtor to pay a debt; and training the predictive model using the historic tax return data, the commercial financial or credit data, and the labels included in the training data.
 14. The device of claim 13, wherein the historic tax return data comprises one or more values that were entered on a personal income tax form by an associated debtor.
 15. The device of claim 13, wherein the debt comprises a debt owed to a government revenue agency.
 16. The device of claim 13, wherein the historic tax return data comprises data that is available only to a government revenue agency or to designated agents of the government revenue agency.
 17. The device of claim 13, wherein the historic tax return data comprises one or more values that reflect an amount of tax listed as due on a tax return in relation to an amount of income listed on the tax return.
 18. The device of claim 13, wherein the historic tax return data comprises one or more values that reflect an amount of tax listed as due on a tax return in relation to an amount of tax listed as owed on the tax return.
 19. The device of claim 13, wherein the historic tax return data comprises one or more values that reflect a tax penalty amount listed on a tax return.
 20. The device of claim 13, wherein the historic tax return data comprises one or more values that reflect an amount of time after a tax deadline in which a tax return was filed.
 21. A system comprising: a processor configured to execute computer program instructions; and a computer storage medium encoded with computer program instructions that, when executed by the processor, cause the system to perform operations comprising: obtaining a predictive model that is trained to estimate a likelihood of a debtor to pay a debt, wherein the predictive model is trained using training data that includes, for each of a plurality of debtors, (i) historic tax return data for the debtor, (ii) commercial financial or credit data that is usable to calculate a credit score for the debtor, and (iii) a label indicating a likelihood of the debtor to pay a debt; identifying a particular debtor; obtaining (i) historic tax return data for the particular debtor, and (ii) commercial financial or credit data that is usable to calculate a credit score for the particular debtor; providing, to the predictive model, (i) the historic tax return data for the particular debtor, and (ii) the commercial financial or credit data that is usable to calculate a credit score for the particular debtor; and obtaining, from the predictive model, an indication of a likelihood of the particular debtor to pay a debt.